
TLS program header is not correctly generated for zero-initialized tls variables#1788

Open
ostylk wants to merge 2 commits into wild-linker:main from ostylk:fix/tls


@ostylk

@ostylk ostylk commented Apr 1, 2026

Contributor

I noticed that the tls-common test failed on my machine depending on the environment and compiler versions I used. It seems the TLS program header is generated incorrectly for zero-initialized TLS variables.

For example, the tls-common test has a variable tvar that lives in .tbss. However, the TLS program header has filesz=4,memsz=4, which does not zero-initialize the variable but instead loads its initial bytes from the ELF binary. It just so happens that in all CI environments and in your local setup the binary contained zeros at that exact location, so the test passed anyway. The correct TLS program header would be filesz=0,memsz=4.
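To illustrate why filesz matters here, below is a minimal sketch of how a loader conventionally builds a thread's initial TLS block from PT_TLS: the first filesz bytes are copied from the file, and the rest up to memsz is zero-filled. The function names and the sample bytes are hypothetical and not taken from wild or from any real loader.

```rust
/// Build the initial TLS block for one thread the way a loader treats PT_TLS:
/// copy `filesz` bytes from the file image, then zero-fill up to `memsz`.
/// `file` and `offset` are illustrative stand-ins for the ELF image and p_offset.
fn init_tls_block(file: &[u8], offset: usize, filesz: usize, memsz: usize) -> Vec<u8> {
    assert!(filesz <= memsz, "p_filesz must not exceed p_memsz");
    // Start fully zeroed, so everything past `filesz` behaves like .tbss.
    let mut block = vec![0u8; memsz];
    // Only the initialized part is taken from the file.
    block[..filesz].copy_from_slice(&file[offset..offset + filesz]);
    block
}

/// Hypothetical file contents with non-zero bytes at the spot where the
/// .tbss variable happened to be placed.
fn sample_file() -> Vec<u8> {
    vec![0xAA, 0xBB, 0xCC, 0xDD]
}
```

With filesz=0,memsz=4 the variable starts out as zeros regardless of what the file contains. With filesz=4,memsz=4 it starts out as whatever bytes sit at that file offset, which is only correct if they happen to be zero.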

So this pull request does/will do two things:

  • Implement program header assertions in the test harness so that this bug fails deterministically. A test can now contain //#ExpectProgramHeader:PT_TLS filesz=0,memsz=4, and it passes only if a program header with at least these properties exists.
  • Fix the actual bug.
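As a rough sketch of the "at least these properties" matching described in the first bullet, the code below parses the text after the directive name and checks it against a list of headers. The type HeaderInfo and both function names are hypothetical; the actual test-harness implementation in this PR may look quite different.

```rust
use std::collections::BTreeMap;

/// A program header reduced to a type name plus named numeric fields.
/// This shape is hypothetical, not wild's actual test-harness type.
#[derive(Debug, Clone, PartialEq)]
struct HeaderInfo {
    kind: String,
    fields: BTreeMap<String, u64>,
}

/// Parse the text after `//#ExpectProgramHeader:`, e.g. "PT_TLS filesz=0,memsz=4".
/// Returns None if the type is missing or a field is not `key=number`.
fn parse_expectation(spec: &str) -> Option<HeaderInfo> {
    let mut parts = spec.trim().splitn(2, ' ');
    let kind = parts.next()?.to_string();
    if kind.is_empty() {
        return None;
    }
    let mut fields = BTreeMap::new();
    let rest = parts.next().unwrap_or("");
    for pair in rest.split(',').filter(|p| !p.trim().is_empty()) {
        let (key, value) = pair.split_once('=')?;
        let value = value.trim().parse::<u64>().ok()?;
        fields.insert(key.trim().to_string(), value);
    }
    Some(HeaderInfo { kind, fields })
}

/// True if some actual header has the expected type and at least the expected
/// fields with equal values. Extra fields on the actual header are ignored.
fn expectation_met(expected: &HeaderInfo, actual: &[HeaderInfo]) -> bool {
    actual.iter().any(|h| {
        h.kind == expected.kind
            && expected.fields.iter().all(|(k, v)| h.fields.get(k) == Some(v))
    })
}
```

Under this scheme, a PT_TLS header with filesz=4,memsz=4 fails the expectation filesz=0,memsz=4 even when the file bytes happen to be zero, which is what makes the failure deterministic.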

This is a draft because the test option still needs documentation, I haven't found the bug yet, and holidays are starting. I'll complete it at a later time.

@davidlattimore

Member

Hi @ostylk. Do you intend to return to this PR?

@ostylk

ostylk commented May 18, 2026

Contributor Author

Hi, thank you for the reminder, this PR slipped my mind.
I do intend to return to it, preferably this month. Am I impeding anything by working on this slowly?

@davidlattimore

Member

All good, I was just checking in on PRs that hadn't been touched in a while. I don't think there's any particular hurry on this though.

@ostylk

ostylk commented Jun 26, 2026

Contributor Author

Ready for review but I have some comments/questions:

  1. The CI failure is weird: it only happens on openSUSE (I could reproduce it locally in a container). Any pointers on how to approach that?
  2. There was a comment in output_section_id.rs about how uninitialized TLS data is padded with zeroes in the output file. However, I couldn't find where that happens in the code. Maybe it was forgotten, and that caused this bug in the first place?
  3. Funnily enough, GNU ld seems to have a bug where uninitialized TLS sections overlap initialized TLS sections (to see for yourself, delete the SkipLinker instruction in the tls-custom.c test). LLD seems to handle this case correctly, though.
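For reference, here is a small sketch of the layout rules that points 2 and 3 touch on: how filesz and memsz of a PT_TLS segment follow from its initialized and uninitialized sections, and what an overlap between them looks like. The struct and function names are illustrative and not taken from wild's source, and the sketch assumes the sections are sorted by address with initialized sections first.

```rust
/// A TLS output section; `nobits` marks uninitialized data such as .tbss.
/// Field names are illustrative, not taken from wild's source.
#[derive(Debug, Clone, Copy)]
struct TlsSection {
    addr: u64,
    size: u64,
    nobits: bool,
}

/// Compute (filesz, memsz) for a PT_TLS segment covering `sections`,
/// assuming they are sorted by address with initialized sections first.
fn tls_segment_sizes(sections: &[TlsSection]) -> Option<(u64, u64)> {
    let start = sections.first()?.addr;
    // memsz spans every section, initialized or not.
    let mem_end = sections.iter().map(|s| s.addr + s.size).max()?;
    // filesz stops at the end of the last initialized section.
    let file_end = sections
        .iter()
        .filter(|s| !s.nobits)
        .map(|s| s.addr + s.size)
        .max()
        .unwrap_or(start);
    Some((file_end - start, mem_end - start))
}

/// True if any section extends past the start of the section that follows it.
fn has_overlap(sections: &[TlsSection]) -> bool {
    sections.windows(2).any(|w| w[0].addr + w[0].size > w[1].addr)
}
```

A segment consisting only of a 4-byte .tbss yields filesz=0,memsz=4 under these rules, matching the header the tls-common test expects.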

@ostylk ostylk marked this pull request as ready for review June 26, 2026 14:27
@davidlattimore

Member

Welcome back!

Regarding the CI failure - RexMovIndirectToAbsolute vs MovIndirectToLea, that sounds like something that could be caused by different defaults as to whether the binary is position independent or not. So possibly the difference might go away if you override those defaults - e.g. by passing a flag to make both linkers produce a position-independent binary.

The PR in which we started putting zero bytes in the file for .tbss was #489. It unfortunately looks like I didn't add a test for it, so it's not impossible that something around there could have regressed.
